Abstract

Given known item parameters, unbiased estimators are derived 1) for an examinee's ability parameter $\theta$ and for his proportion-correct true score $\zeta$, 2) for $s_\theta^2$, the variance of $\theta$ across examinees in the group tested, also for $s_\zeta^2$, and 3) for the parallel-forms reliability of the observed test score, the maximum likelihood estimator $\hat\theta$.

Unbiased Estimators 



Unbiased Estimators of Ability Parameters, of Their Variance,
and of Their Parallel-Forms Reliability



This paper is primarily concerned with determining the statistical bias in the maximum likelihood estimate $\hat\theta$ of the examinee ability parameter $\theta$ in item response theory (IRT) [Lord, 1980]; also of certain functions of such parameters. We will deal only with unidimensional tests composed of dichotomously scored items. We assume the item response function is three-parameter logistic (2).

Available results for the sampling variance of $\hat\theta$ are currently limited to the case where the item parameters are known. The present derivations are limited to this case also. This limitation is tolerable in situations where the item parameters are predetermined, as in item banking and tailored testing.

In the absence of a prior distribution for $\theta$, it is well known that examinees with perfect scores have $\hat\theta = \infty$; also that examinees who perform near or below the chance level on multiple-choice items may be given large negative values of $\hat\theta$. This (correctly) suggests that $\hat\theta$ is positively biased for high-ability examinees and negatively biased for low-ability examinees. Will a correction of $\hat\theta$ for bias be helpful in such cases?



1. This work was supported in part by contract N00014-80-C-0402, project designation NR 150-453 between the Office of Naval Research and Educational Testing Service. Reproduction in whole or in part is permitted for any purpose of the United States Government.

It is also well known that for any ordinary group of examinees, the variance ($s_{\hat\theta}^2$) of $\hat\theta$ across examinees is larger than the variance ($s_\theta^2$) of the true $\theta$. The ratio $s_\theta^2/s_{\hat\theta}^2$ is closely related to the classical-test-theory reliability of $\hat\theta$ considered as the examinee's test score. Thus it is not enough for us to know that $s_{\hat\theta}^2 \to s_\theta^2$ as the number $n$ of test items becomes large; we need to know how the relation of $s_{\hat\theta}^2$ to $s_\theta^2$ varies as a function of $n$. We also need a better estimate of $s_\theta^2$ than its maximum-likelihood estimator $s_{\hat\theta}^2$. These objectives can be achieved by correcting $s_{\hat\theta}^2$ for bias.

The methods used to derive formulas for correction for bias are presented here in detail for at least two reasons: 1) Experience with similar derivations has shown that it is easy to reach erroneous results if details are not spelled out. 2) The general methods used here are easily transferred to solve other problems, such as a) correction of item parameters for bias, b) obtaining higher-order approximations to the sampling variance of $\hat\theta$.

1. Statistical Bias in $\hat\theta$ and $\hat\zeta$

The method used here to find the bias of $\hat\theta$ is adapted from the 'adjusted order of magnitude' procedure detailed by Shenton and Bowman (1977). They assume their data to be a sample from a population divided into a denumerable number of subsets. For them, the population proportion of observations in a given subset is a known function of the parameter $\theta$ whose value they wish to estimate. Their sample estimate of $\theta$ is therefore a function of observed sample proportions in the various subsets. Since our data do not readily fit this picture, we cannot use their final published formulas but must instead derive our own.

Throughout Section 1, we deal with a single fixed examinee whose ability $\theta$ is the parameter to be estimated. All item parameters are assumed known.



1.1 Preliminaries

The maximum likelihood estimate $\hat\theta$ is obtained by solving the likelihood equation

$$\sum_{i=1}^{n} (u_i - \hat P_i)\,\hat P_i' / \hat P_i \hat Q_i = 0 , \tag{1}$$

where $u_i = 0$ or $1$ is the examinee's response to item $i$ ($i = 1, 2, \ldots, n$), $P_i \equiv P_i(\theta)$ is the response function for item $i$, $Q_i \equiv 1 - P_i$, $P_i'$ is the derivative of $P_i$ with respect to $\theta$, and a caret indicates that the function is to be evaluated at $\hat\theta$. We deal with the case where $P_i$ is the three-parameter logistic function

$$P_i \equiv c_i + \frac{1 - c_i}{1 + e^{-A_i(\theta - b_i)}} , \tag{2}$$

where $A_i$, $b_i$, and $c_i$ are item parameters describing item $i$.
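As an illustration of how (1) and (2) fit together, the following minimal sketch evaluates the three-parameter logistic function and solves the likelihood equation numerically. The solver (Fisher scoring with a capped step), the function names, and the five-item test are assumptions made for illustration only; the paper does not prescribe a numerical method.

```python
import math

def p3pl(theta, A, b, c):
    """Three-parameter logistic response function P_i(theta), equation (2)."""
    return c + (1.0 - c) / (1.0 + math.exp(-A * (theta - b)))

def dp3pl(theta, A, b, c):
    """First derivative P_i'(theta); for (2) it equals A*Q*(P - c)/(1 - c)."""
    p = p3pl(theta, A, b, c)
    return A * (1.0 - p) * (p - c) / (1.0 - c)

def score(theta, u, items):
    """Left side of the likelihood equation (1), evaluated at theta."""
    total = 0.0
    for u_i, (A, b, c) in zip(u, items):
        p = p3pl(theta, A, b, c)
        total += (u_i - p) * dp3pl(theta, A, b, c) / (p * (1.0 - p))
    return total

def information(theta, items):
    """Sum of P_i'^2 / (P_i Q_i); used here only to scale the update step."""
    total = 0.0
    for A, b, c in items:
        p = p3pl(theta, A, b, c)
        total += dp3pl(theta, A, b, c) ** 2 / (p * (1.0 - p))
    return total

def mle_theta(u, items, start=0.0, tol=1e-10, max_iter=200):
    """Solve (1) by Fisher scoring with a capped step (an assumed solver)."""
    theta = start
    for _ in range(max_iter):
        step = score(theta, u, items) / information(theta, items)
        step = max(-1.0, min(1.0, step))  # cap the step to avoid wild jumps
        theta += step
        if abs(step) < tol:
            break
    return theta

# Hypothetical five-item test: tuples are (A_i, b_i, c_i).
items = [(1.7, -1.0, 0.2), (1.7, -0.5, 0.2), (1.7, 0.0, 0.2),
         (1.7, 0.5, 0.2), (1.7, 1.0, 0.2)]
u = [1, 1, 1, 0, 0]
theta_hat = mle_theta(u, items)
```

For a perfect response pattern the likelihood equation has no finite root, so a routine of this kind would simply drift upward until the iteration limit is reached.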
We will assume

1. $\theta$ is a bounded variable,

2. the item parameters $a_i$ and $b_i$ are bounded,

3. $c_i$ is bounded away from 1

(thus $P_i$ and $Q_i$ are bounded away from 0 and 1);

4. as $n$ becomes large, the statistical characteristics of the test stabilize.

Rather than trying to define this last assumption formally, the reader may substitute the more restrictive assumption usually made in mental test theory: that a test is lengthened by adding strictly parallel forms.

With these assumptions, the conditions of Bradley and Gart (1962) are satisfied. It follows from their theorems that $\hat\theta$ is a consistent estimator of $\theta$ and that $\sqrt{n}(\hat\theta - \theta)$ is asymptotically normally distributed with mean zero and variance $\left[\lim_{n\to\infty} \frac{1}{n}\sum_i P_i'^2/P_iQ_i\right]^{-1}$. The existence of the limit is guaranteed by assumption 4.

For compactness, we will rewrite (1) as

$$\hat L_1 \equiv \sum_i \hat\Gamma_{1i} = 0 , \tag{3}$$

where by definition

$$\Gamma_{1i} \equiv (u_i - P_i) P_i' / P_i Q_i . \tag{4}$$

Now $\hat L_1$ considered as a function of $\hat\theta$ can be expanded formally in powers of $\hat\theta - \theta$ as follows:

$$\hat L_1 = \sum_i \Gamma_{1i} + (\hat\theta - \theta)\sum_i \Gamma_{2i} + \frac{1}{2!}(\hat\theta - \theta)^2 \sum_i \Gamma_{3i} + \cdots ,$$

where we define

$$\Gamma_{si} \equiv \frac{\partial^s}{\partial\theta^s} \log P_i^{u_i} Q_i^{1-u_i} \qquad (s = 1, 2, \ldots). \tag{5}$$

This definition is consistent with (3).

Let $x \equiv \hat\theta - \theta$, $\Gamma_s \equiv \frac{1}{n}\sum_i \Gamma_{si}$. Rather than proving the convergence of the power series, let us use a closed form that is always valid:

$$\frac{1}{n}\hat L_1 = \Gamma_1 + x\Gamma_2 + \frac{1}{2}x^2\Gamma_3 + \frac{1}{6}x^3\Gamma_4 + \frac{\delta}{24}x^4\bar\Gamma_5 , \tag{6}$$

where $\bar\Gamma_5 \equiv \max |\Gamma_5|$ and $|\delta| < 1$.

1.2 Derivatives and Expectations

To proceed further, it is necessary to evaluate the $\Gamma_{si}$. It is found that

[Equation (7), a closed-form expression for $\Gamma_{si}$, is illegible in the source.] (7)

where the coefficient appearing in (7) is a Stirling number of the second kind (Jordan, 1947, pp. 31-32, 168).

Define

$$\gamma_{si} \equiv \mathcal{E}\,\Gamma_{si} , \tag{8}$$

$$\epsilon_{si} \equiv \Gamma_{si} - \mathcal{E}\,\Gamma_{si} . \tag{9}$$

Since $\mathcal{E}u_i = P_i$, we find that

$$\gamma_{1i} = 0 , \tag{10}$$

$$\gamma_{2i} = -\frac{P_i'^2}{P_iQ_i} , \tag{11}$$

$$\gamma_{3i} = \frac{A_i^2 P_i'(P_i - c_i)}{(1 - c_i)^2 P_i^2}\left[2(P_i^2 - c_i) - P_i(1 - c_i)\right] , \tag{12}$$

$$\epsilon_{1i} = \Gamma_{1i} = (u_i - P_i)P_i'/P_iQ_i , \tag{13}$$

$$\epsilon_{2i} = (u_i - P_i)\frac{A_i c_i P_i'}{(1 - c_i)P_i^2} . \tag{14}$$

Let

$$\gamma_s \equiv \frac{1}{n}\sum_i \gamma_{si} , \qquad \epsilon_s \equiv \frac{1}{n}\sum_i \epsilon_{si} . \tag{15}$$

We will denote the Fisher information by

$$I \equiv -\mathcal{E}(dL_1/d\theta) = -n\gamma_2 = \sum_i P_i'^2/P_iQ_i . \tag{16}$$

Setting (6) equal to zero, the likelihood equation can now be written in terms of the $\gamma_s$ and the $\epsilon_s$ as

$$-\epsilon_1 = x(\gamma_2 + \epsilon_2) + \frac{1}{2}x^2(\gamma_3 + \epsilon_3) + \frac{1}{6}x^3(\gamma_4 + \epsilon_4) + \frac{\delta}{24}x^4\bar\Gamma_5 . \tag{17}$$



We will need some information about the order of magnitude of terms such as those in (17). It may be seen from (7) that each $\epsilon_s$ has the form

$$\epsilon_s = \frac{1}{n}\sum_i K_{si}(u_i - P_i) ,$$

where $K_{si}$ does not depend on $n$ or on $u_i$. Since $P_i$, $Q_i$ and $1 - c_i$ are bounded away from zero, the $K_{si}$ and thus $\epsilon_s$ are bounded. By assumption (4), the bound does not depend on $n$. The same conclusion holds for $\gamma_s$.

Since $\sqrt{n}\,x$ is asymptotically normally distributed with zero mean and finite variance, it follows that $\mathcal{E}x^r$ ($r = 1, 2, \ldots$) is of order $n^{-r/2}$. A similar statement is true of $\sqrt{n}\,\epsilon_s$. Thus finally $\mathcal{E}x^r\epsilon_s^t \le (\mathcal{E}x^{2r}\,\mathcal{E}\epsilon_s^{2t})^{1/2}$, so that $\mathcal{E}x^r\epsilon_s^t$ is of order $n^{-(r+t)/2}$ ($r, t = 1, 2, \ldots$).
1.3 First-Order Variance of $\hat\theta$

To clarify the procedure, let us derive from (17) the familiar formula for the asymptotic variance of $\hat\theta$. Square (17) and take expectations to obtain

$$\mathcal{E}\epsilon_1^2 = \gamma_2^2\,\mathcal{E}x^2 + 2\gamma_2\,\mathcal{E}x^2\epsilon_2 + \mathcal{E}x^2\epsilon_2^2 + \gamma_2\gamma_3\,\mathcal{E}x^3 + \gamma_2\,\mathcal{E}\epsilon_3x^3 + \cdots . \tag{18}$$

If we wish to neglect terms $o(n^{-1})$ (of higher order than $n^{-1}$), equation (18) becomes

$$\mathcal{E}x^2 = \frac{1}{\gamma_2^2}\,\mathcal{E}\epsilon_1^2 + o(n^{-1}) . \tag{19}$$

By (13) and (16), because of local independence,

$$\mathcal{E}\epsilon_1^2 = \frac{1}{n^2}\sum_i \left(\frac{P_i'}{P_iQ_i}\right)^2 \operatorname{Var} u_i = \frac{1}{n^2}\sum_i \frac{P_i'^2}{P_iQ_i} = \frac{I}{n^2} . \tag{20}$$

Thus, finally

$$\operatorname{Var}\hat\theta = \frac{1}{I} + o(n^{-1}) , \tag{21}$$

a well-known result. It is derived here to clarify the reasoning to be used subsequently. If $\hat\theta$ is substituted for $\theta$ on the right side of (21), the formula will still be correct to the specified order of approximation.

1.4 Statistical Bias of $\hat\theta$

Take the expectation of (17) to obtain

$$\gamma_2\,\mathcal{E}_1 x + \mathcal{E}_1 x\epsilon_2 + \frac{1}{2}\gamma_3\,\mathcal{E}_1 x^2 = 0 , \tag{22}$$

where $\mathcal{E}_1$ indicates an expectation in which only terms of order $n^{-1}$ are to be retained. Also multiply (17) by $\epsilon_2$ and take expectations to obtain

$$-\mathcal{E}_1\epsilon_1\epsilon_2 = \gamma_2\,\mathcal{E}_1 x\epsilon_2 . \tag{23}$$

By (9)

$$\mathcal{E}\epsilon_r = 0 , \qquad r = 1, 2, \ldots . \tag{24}$$

From (13) and (14)

$$\mathcal{E}\epsilon_1\epsilon_2 = \frac{1}{n^2}\sum_i \frac{A_i c_i P_i'^2}{(1 - c_i)P_i^2} . \tag{25}$$

Substituting (16) and (25) into (23), we have the covariance

$$\mathcal{E}_1 x\epsilon_2 = \frac{1}{nI}\sum_i \frac{A_i c_i P_i'^2}{(1 - c_i)P_i^2} . \tag{26}$$

Finally, substituting (16), (21), (24), and (26) into (22) and solving for $\mathcal{E}_1 x$, we have the bias

$$B_1(\hat\theta) \equiv \mathcal{E}_1(\hat\theta - \theta) = \frac{1}{I^2}\left(\sum_i \frac{A_i c_i P_i'^2}{(1 - c_i)P_i^2} + \frac{n}{2}\gamma_3\right) . \tag{27}$$
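The substitution that leads to (27) can be spelled out step by step. The lines below are a sketch of that algebra, using only $\gamma_2 = -I/n$ from (16), the first-order variance $\mathcal{E}_1 x^2 = 1/I$ from (21), and the covariance (26); the intermediate arrangement is ours and is not copied from the source.

```latex
% Solving (22) for the bias, using (16), (21) and (26).
\begin{align*}
\mathcal{E}_1 x
  &= -\frac{1}{\gamma_2}\Bigl(\mathcal{E}_1 x\epsilon_2
       + \tfrac12\,\gamma_3\,\mathcal{E}_1 x^2\Bigr)
     && \text{from (22)} \\
  &= \frac{n}{I}\Bigl(\frac{1}{nI}\sum_i \frac{A_i c_i P_i'^2}{(1-c_i)P_i^2}
       + \frac{\gamma_3}{2I}\Bigr)
     && \text{since } \gamma_2 = -I/n,\ \mathcal{E}_1 x^2 = 1/I \\
  &= \frac{1}{I^2}\Bigl(\sum_i \frac{A_i c_i P_i'^2}{(1-c_i)P_i^2}
       + \frac{n}{2}\,\gamma_3\Bigr).
\end{align*}
```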



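This may be rewritten as

$$B_1(\hat\theta) = \frac{1}{I^2}\sum_i A_i I_i\left(\phi_i - \frac{1}{2}\right) , \tag{28}$$

where

$$\phi_i \equiv \frac{P_i - c_i}{1 - c_i} , \qquad I_i \equiv \frac{P_i'^2}{P_iQ_i} .$$

Since $I$ is of order $n$, $B_1(\hat\theta)$ is of order $n^{-1}$. It may be of interest to note that in the special case where all items are equivalent (all $P_i(\theta)$ are the same), the bias simplifies to $B_1(\hat\theta) = -PQP''/2nP'^3$, where $P'' \equiv d^2P/d\theta^2$.

1.5 Numerical Results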



A hypothetical test was designed to approximate the College Entrance Examination Board's Scholastic Aptitude Test, Verbal Section. This test is composed of $n = 90$ five-choice items. Some information about the distributions of the parameters of the 90 hypothetical items is given in Table 1.

The standard error and bias of $\hat\theta$ were computed from (21) and from (27) respectively for various values of $\theta$. The results are shown in Table 2. It appears that the bias in $\hat\theta$ is negligible for moderate values of $\theta$, but is sizable for extreme values. Note that the bias is positively correlated with $\theta$. Because of guessing, zero bias does not occur at $\theta = 0$ but at $\theta = .34$ approximately.
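A computation of this kind can be sketched as follows. The code evaluates the first-order bias of $\hat\theta$ in two ways: from the form (27), using for $\gamma_{3i}$ the expression obtained by differentiating the log likelihood three times under model (2), and from the equivalent weighted form $I^{-2}\sum_i A_i I_i(\phi_i - \tfrac12)$ with $\phi_i = (P_i - c_i)/(1 - c_i)$ and $I_i = P_i'^2/P_iQ_i$. The item parameters are hypothetical and are not those of the 90-item test described above.

```python
import math

def p3pl(theta, A, b, c):
    """Three-parameter logistic response function, equation (2)."""
    return c + (1.0 - c) / (1.0 + math.exp(-A * (theta - b)))

def bias_theta_27(theta, items):
    """Bias in the form (27): (sum A c P'^2/((1-c) P^2) + (n/2) gamma_3) / I^2."""
    info = first = n_gamma3 = 0.0
    for A, b, c in items:
        p = p3pl(theta, A, b, c)
        q = 1.0 - p
        dp = A * q * (p - c) / (1.0 - c)              # P_i'
        info += dp * dp / (p * q)
        first += A * c * dp * dp / ((1.0 - c) * p * p)
        # gamma_3i; summing it over items gives n * gamma_3
        n_gamma3 += (A * A * dp * (p - c) / ((1.0 - c) ** 2 * p * p)
                     * (2.0 * (p * p - c) - p * (1.0 - c)))
    return (first + 0.5 * n_gamma3) / info ** 2

def bias_theta_28(theta, items):
    """Bias in the weighted form: sum A_i I_i (phi_i - 1/2) / I^2."""
    info = weighted = 0.0
    for A, b, c in items:
        p = p3pl(theta, A, b, c)
        q = 1.0 - p
        dp = A * q * (p - c) / (1.0 - c)
        i_i = dp * dp / (p * q)
        phi = (p - c) / (1.0 - c)
        info += i_i
        weighted += A * i_i * (phi - 0.5)
    return weighted / info ** 2

# Hypothetical five-item test: tuples are (A_i, b_i, c_i).
items = [(1.2, -1.0, 0.15), (1.7, -0.3, 0.20), (2.0, 0.0, 0.10),
         (1.5, 0.6, 0.25), (1.0, 1.2, 0.20)]
grid = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
biases = [bias_theta_28(t, items) for t in grid]
```

Because each term of the weighted form has the sign of $\phi_i - \tfrac12$, the computed bias is negative when $\theta$ lies below every $b_i$ and positive when it lies above every $b_i$, in line with the pattern described for Table 2.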

TABLE 1

Range and Quartiles of the Item Parameters
in 90-Item Hypothetical Test

                     a_i = A_i/1.7      b_i        c_i
Highest value            1.88           2.32       .47
Upper quartile           1.07           1.15       .20
Median                    .83            .38       .15
Lower quartile            .69           -.41       .13
Lowest value              .41          -3.94       .01

1.6 Variance and Bias of Estimated True Score

Since the ability scale is not unique, any monotonic transformation of $\theta$ can serve as a measure of ability. Two transformations are particularly useful: $e^\theta$ and

$$\zeta \equiv \frac{1}{n}\sum_{i=1}^{n} P_i(\theta) ,$$

the proportion-correct true score (the number-right true score divided by the number of items). One important reason for using the latter transformation is the following.

Ordinarily, as in Table 2, we find large standard errors of $\hat\theta$ where $\theta$ is extreme. Usually these large standard errors are no more harmful to the user than are the smaller standard errors found when $\theta$ is near the level aimed at by the test. There is a reason why this is so: if it were not, the user should have designed his test so as to reduce those standard errors that were troublesome to him.

We see that from this point of view the size of a difference on the $\theta$ scale does not correspond to its importance. The discrepancy is greatly reduced, however, if we measure ability on the $\zeta$ scale instead of on the $\theta$ scale. This is one reason, among several, why we are interested in the variance and bias of

$$\hat\zeta \equiv \sum_i P_i(\hat\theta)/n . \tag{31}$$


TABLE 2

Standard Error and Statistical Bias in $\hat\theta$

[Table entries are not recoverable from the source.]

Although the proportion-correct score

$$z \equiv \sum_i u_i/n \tag{32}$$

is an unbiased estimator of $\zeta$, $z$ is never a fully efficient estimator of $\zeta$ unless $c_i = 0$ and $a_i = a_j$ ($i, j = 1, 2, \ldots, n$): the sampling variance

$$\operatorname{Var} z = \frac{1}{n^2}\sum_{i=1}^{n} P_iQ_i \tag{33}$$

is not as small as the sampling variance of $\hat\zeta$, which we must now derive.



By (31)

$$d\zeta = \frac{1}{n}\sum_{i=1}^{n} P_i'\,d\theta . \tag{34}$$

Using the 'delta' method

$$\operatorname{Var}\hat\zeta = \frac{1}{n^2}\Bigl(\sum_{i=1}^{n} P_i'\Bigr)^2 \operatorname{Var}\hat\theta .$$

By (21) and (16)

$$\operatorname{Var}\hat\zeta = \frac{\bigl(\sum_i P_i'\bigr)^2}{n^2\sum_i P_i'^2/P_iQ_i} . \tag{35}$$
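The comparison between (33) and (35) can be checked numerically. The sketch below computes both sampling variances under model (2); by the Cauchy-Schwarz inequality, $(\sum_i P_i')^2 \le \sum_i (P_i'^2/P_iQ_i)\sum_i P_iQ_i$, so (35) can never exceed (33). The item parameters are hypothetical and chosen only for illustration.

```python
import math

def p3pl(theta, A, b, c):
    """Three-parameter logistic response function, equation (2)."""
    return c + (1.0 - c) / (1.0 + math.exp(-A * (theta - b)))

def var_z(theta, items):
    """Sampling variance of the proportion-correct score z, equation (33)."""
    n = len(items)
    total = 0.0
    for A, b, c in items:
        p = p3pl(theta, A, b, c)
        total += p * (1.0 - p)
    return total / n ** 2

def var_zeta_hat(theta, items):
    """First-order sampling variance of zeta-hat, equation (35)."""
    n = len(items)
    sum_dp = info = 0.0
    for A, b, c in items:
        p = p3pl(theta, A, b, c)
        q = 1.0 - p
        dp = A * q * (p - c) / (1.0 - c)  # P_i'
        sum_dp += dp
        info += dp * dp / (p * q)
    return sum_dp ** 2 / (n ** 2 * info)

# Hypothetical items: tuples are (A_i, b_i, c_i).
mixed = [(1.2, -1.0, 0.15), (1.7, -0.3, 0.20), (2.0, 0.0, 0.10),
         (1.5, 0.6, 0.25), (1.0, 1.2, 0.20)]
identical = [(1.7, 0.0, 0.2)] * 5
ratio_low = var_zeta_hat(-2.0, mixed) / var_z(-2.0, mixed)
```

When all items are identical the two variances coincide, since $\hat\zeta$ and $z$ are then the same statistic.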



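To find the bias of $\hat\zeta$, we expand it in powers of $x \equiv \hat\theta - \theta$:

$$\hat\zeta - \zeta = \frac{x}{n}\sum_i P_i' + \frac{x^2}{2n}\sum_i P_i'' + \cdots , \tag{36}$$

where

$$P_i'' \equiv d^2P_i/d\theta^2 .$$

Taking expectations, and neglecting higher-order terms, we have for the bias

$$B_1(\hat\zeta) \equiv \mathcal{E}_1(\hat\zeta - \zeta) = \frac{1}{n}\Bigl[B_1(\hat\theta)\sum_i P_i' + \frac{1}{2}\operatorname{Var}\hat\theta\,\Bigl(\sum_i P_i''\Bigr)\Bigr] . \tag{37}$$

This can be rewritten as

$$B_1(\hat\zeta) = \frac{\zeta'}{I^2}\Bigl(\sum_i \frac{A_i c_i P_i'^2}{(1 - c_i)P_i^2} + \frac{n}{2}\gamma_3\Bigr) + \frac{\zeta''}{2I} , \tag{38}$$

where $\zeta' \equiv \sum_i P_i'/n$ and $\zeta'' \equiv \sum_i P_i''/n$. Let us note in passing that when all items are equivalent (all $P_i(\theta)$ are the same), $\hat\zeta = z$ and its bias (38) is zero.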
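The vanishing of the bias for equivalent items can be confirmed numerically. The sketch below evaluates (37) under model (2), taking the bias of $\hat\theta$ in the weighted form $I^{-2}\sum_i A_i I_i(\phi_i - \tfrac12)$ with $\phi_i = (P_i - c_i)/(1 - c_i)$, and using $P_i'' = A_i P_i'(1 + c_i - 2P_i)/(1 - c_i)$, which follows from differentiating (2) twice. Function names and item parameters are hypothetical.

```python
import math

def p3pl(theta, A, b, c):
    """Three-parameter logistic response function, equation (2)."""
    return c + (1.0 - c) / (1.0 + math.exp(-A * (theta - b)))

def bias_zeta_hat(theta, items):
    """First-order bias of zeta-hat from (37)."""
    n = len(items)
    info = weighted = sum_dp = sum_d2p = 0.0
    for A, b, c in items:
        p = p3pl(theta, A, b, c)
        q = 1.0 - p
        dp = A * q * (p - c) / (1.0 - c)                 # P_i'
        d2p = A * dp * (1.0 + c - 2.0 * p) / (1.0 - c)   # P_i''
        i_i = dp * dp / (p * q)
        info += i_i
        weighted += A * i_i * ((p - c) / (1.0 - c) - 0.5)
        sum_dp += dp
        sum_d2p += d2p
    bias_theta = weighted / info ** 2   # first-order bias of theta-hat
    var_theta = 1.0 / info              # first-order variance, equation (21)
    return (bias_theta * sum_dp + 0.5 * var_theta * sum_d2p) / n

# Hypothetical tests: ten equivalent items, and five differing items.
equivalent = [(1.7, 0.0, 0.2)] * 10
mixed = [(1.2, -1.0, 0.15), (1.7, -0.3, 0.20), (2.0, 0.0, 0.10),
         (1.5, 0.6, 0.25), (1.0, 1.2, 0.20)]
bias_equivalent = bias_zeta_hat(-1.0, equivalent)
bias_mixed = bias_zeta_hat(-1.0, mixed)
```

For the equivalent-item test the two terms of (37) cancel exactly, apart from rounding error; for the mixed test they generally do not.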

1.7 Numerical Results

Table 3 shows the bias in $\hat\zeta$ for the same hypothetical test considered in Section 1.5. The biases are all positive. However, they are negligible at all except the lowest ability levels. This tends to confirm our choice of the $\zeta$ scale of ability rather than the $\theta$ scale for many purposes.



TABLE 3

Standard Error of $z$ and of $\hat\zeta$,
and Statistical Bias of $\hat\zeta$

  theta     zeta     sqrt(Var z)   sqrt(Var zeta-hat)   B(zeta-hat)
   3.5      .981        .014             .014             .00045
   3.0      .966        .019             .018             .00052
   2.5      .937        .024             .023             .00064
   2.0      .891        .031             .029             .00059
   1.5      .812        .037             .035             .00021
   1.0      .715        .042             .040             .00026
   0.5      .608        .045             .042             .00061
   0        .506        .046             .043             .00061
  -0.5      .416        .047             .042             .00062
  -1.0      .344        .046             .038             .00061
  -1.5      .291        .045             .037             .00085
  -2.0      .254        .044             .033             .0014
  -2.5      .227        .042             .029             .0020
  -3.0      .211        .042             .025             .0024
  -3.5      .199        .041             .021             .0026
As a matter of incidental interest, for selected values of true score $\zeta$ Table 3 compares the standard error (35) of the maximum-likelihood estimator $\hat\zeta$ with the standard error (33) of the unbiased estimator $z$ (proportion-correct score). There is little difference in accuracy between the two estimators for $\zeta > .5$. At low true-score levels, the maximum-likelihood estimator is much better than the proportion of correct answers.

2. Unbiased Estimation of $s_\theta^2$, of Test Reliability

The symbols $s_\theta^2$ and $s_\zeta^2$ are used for the sample variance of $\theta$ and of $\zeta$ across the $N$ examinees in the sample:

$$s_\theta^2 \equiv \frac{1}{N}\sum_{a=1}^{N}\theta_a^2 - \Bigl(\frac{1}{N}\sum_{a=1}^{N}\theta_a\Bigr)^2 . \tag{39}$$

The maximum-likelihood estimators of $s_\theta^2$ and $s_\zeta^2$ are $s_{\hat\theta}^2$ and $s_{\hat\zeta}^2$, the sample variances across examinees of $\hat\theta$ and of $\hat\zeta$.

2.1 Asymptotically Unbiased Estimator of $\sigma_\theta^2$

Assume that our examinees are a random sample of $N$ from some population. Denote by $\sigma_\theta^2$ the population variance of $\theta$. Then $Ns_\theta^2/(N-1)$ is an unbiased estimator of $\sigma_\theta^2$. Since $s_\theta^2$ is unobservable, our first task is to find a function of $\hat\theta$ that is an asymptotically unbiased estimator of $\sigma_\theta^2$.
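By the formula for the variance of a sum we have

$$\sigma_{\hat\theta}^2 \equiv \sigma_{\theta+x}^2 = \sigma_\theta^2 + \sigma_x^2 + 2\sigma_{\theta x} , \tag{40}$$

where $\sigma^2$ denotes a variance across all examinees in the population and $\sigma_{\theta x}$ is the corresponding population covariance. By a well-known identity from the analysis of variance

$$\sigma_x^2 = \mathcal{E}_\theta\,\sigma_{x|\theta}^2 + \sigma_{\mathcal{E}(x|\theta)}^2 , \tag{41}$$

where $\mathcal{E}_\theta$ denotes an expectation across all examinees in the population. Similarly

$$\sigma_{\theta x} = \sigma_{\theta,\,\mathcal{E}(x|\theta)} . \tag{42}$$

Substituting (41) and (42) into (40), transposing, writing $B \equiv \mathcal{E}(x|\theta)$ as in (28), and dropping the subscript from $\mathcal{E}_\theta$ for convenience, we have

$$\sigma_\theta^2 = \sigma_{\hat\theta}^2 - 2\sigma_{\theta B} - \mathcal{E}\,\sigma_{x|\theta}^2 - \sigma_B^2 . \tag{43}$$

Since by (28) $B$ is of order $n^{-1}$, its variance is of order $n^{-2}$, so $\sigma_B^2$ can be neglected in (43). Since Section 1 deals with a single fixed examinee, the symbol $\sigma_{x|\theta}^2$ in (43) has the same meaning as $\operatorname{Var}\hat\theta$ in (21):

$$\sigma_{x|\theta}^2 = \frac{1}{I(\theta)} + o(n^{-1}) ,$$

where $I \equiv I(\theta)$ is given by (16). Since $\sigma_{x|\theta}^2$ is of order $n^{-1}$, the effect of replacing $\theta$ by $\hat\theta$ on the right is negligible:

$$\mathcal{E}\,\sigma_{x|\theta}^2 = \mathcal{E}\,\frac{1}{I(\hat\theta)} + o(n^{-1}) .$$

By similar reasoning, we may replace $\sigma_{\theta B}$ in (43) by $\sigma_{\hat\theta\hat B}$, where $\hat B$ is defined by (27) with $\theta$ replaced by $\hat\theta$. The result of these approximations is that

$$\sigma_\theta^2 = \sigma_{\hat\theta}^2 - 2\sigma_{\hat\theta\hat B} - \mathcal{E}\,\frac{1}{I(\hat\theta)} + o(n^{-1}) . \tag{44}$$

A useful estimator of $\sigma_\theta^2$ can be calculated from

$$\hat\sigma_\theta^2 = \frac{N}{N-1}\,s_{\hat\theta}^2 - \frac{2N}{N-1}\,s_{\hat\theta\hat B} - \frac{1}{N}\sum_{a=1}^{N}\frac{1}{I(\hat\theta_a)} , \tag{45}$$

where

$$s_{\hat\theta\hat B} \equiv \frac{1}{N}\sum_{a=1}^{N}\hat\theta_a\hat B_a - \Bigl(\frac{1}{N}\sum_{a=1}^{N}\hat\theta_a\Bigr)\Bigl(\frac{1}{N}\sum_{a=1}^{N}\hat B_a\Bigr)$$

and $\hat B_a$ is given by (27) with $\theta$ replaced by $\hat\theta_a$. If we wish to estimate the sample variance of ability $s_\theta^2$ rather than the population variance $\sigma_\theta^2$, we can use

$$\hat s_\theta^2 = s_{\hat\theta}^2 - 2s_{\hat\theta\hat B} - \frac{N-1}{N^2}\sum_{a=1}^{N}\frac{1}{I(\hat\theta_a)} . \tag{46}$$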
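The two estimators can be computed directly once each examinee's $\hat\theta_a$, estimated bias $\hat B_a$, and information $I(\hat\theta_a)$ are available. The sketch below takes these three lists as inputs. It assumes that the correction term in the estimator of the sample variance carries the factor $(N-1)/N^2$, which is the reading under which that estimator equals $(N-1)/N$ times the estimator (45) of the population variance. The function name and the three-examinee data are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def variance_estimates(theta_hats, biases, infos):
    """Return (population-variance estimate (45), sample-variance estimate).

    theta_hats: maximum likelihood ability estimates, one per examinee
    biases:     estimated bias of each theta-hat
    infos:      test information evaluated at each theta-hat
    """
    N = len(theta_hats)
    m_theta = mean(theta_hats)
    # Sample variance of theta-hat with divisor N, as in (39).
    s2 = mean([t * t for t in theta_hats]) - m_theta ** 2
    # Sample covariance of theta-hat with its estimated bias.
    s_tb = mean([t * b for t, b in zip(theta_hats, biases)]) - m_theta * mean(biases)
    mean_inv_info = mean([1.0 / i for i in infos])
    sigma2_hat = N / (N - 1) * s2 - 2.0 * N / (N - 1) * s_tb - mean_inv_info
    s2_hat = s2 - 2.0 * s_tb - (N - 1) / N * mean_inv_info
    return sigma2_hat, s2_hat

# Hypothetical data for three examinees.
sigma2_hat, s2_hat = variance_estimates([0.0, 1.0, 2.0], [0.0, 0.0, 0.0], [4.0, 4.0, 4.0])
```

In this toy example the sample variance of $\hat\theta$ is $2/3$, the bias covariance is zero, and the mean of $1/I$ is $.25$, giving $.75$ for the population estimate and $.50$ for the sample estimate.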
The second and third terms of (44) are of order $n^{-1}$, an order of magnitude smaller than the first term but larger than the neglected terms. The covariance of $\hat\theta$ and $\hat B$ is usually positive, as can be readily seen from Table 2. Since $I(\hat\theta)$ is necessarily positive, it appears that usually $\sigma_\theta^2 < \sigma_{\hat\theta}^2$, an inequality that is frequently assumed without proof. It is not clear whether this inequality is necessarily true.



2.2 The Reliability of $\hat\theta$

Consider the parallel-forms reliability coefficient $\rho_{\hat\theta\hat\theta'}$, the correlation between scores $\hat\theta$ and $\hat\theta'$ on two parallel tests. For present purposes, two tests are parallel when for each item in one test there is an item in the other test with the same item response function. Let us estimate

$$\rho_{\hat\theta\hat\theta'} \equiv \sigma_{\hat\theta\hat\theta'}/\sigma_{\hat\theta}^2 \tag{47}$$

from a single test administration by substituting asymptotically unbiased estimators of the numerator and of the denominator into (47).

As in (41),

$$\sigma_{\hat\theta\hat\theta'} = \mathcal{E}\,\sigma_{\hat\theta\hat\theta'|\theta} + \sigma_{\mathcal{E}(\hat\theta|\theta),\,\mathcal{E}(\hat\theta'|\theta)} . \tag{48}$$

Because of local independence, the first term on the right vanishes. Because of parallelism, the two expectations in the last term are identical, so this term is a variance. We thus have

$$\sigma_{\hat\theta\hat\theta'} = \sigma_{\mathcal{E}(\hat\theta|\theta)}^2 . \tag{49}$$

From (49) and (43),

$$\sigma_{\mathcal{E}(\hat\theta|\theta)}^2 = \sigma_{B+\theta}^2 = \sigma_B^2 + \sigma_\theta^2 + 2\sigma_{\theta B} = \sigma_{\hat\theta}^2 - \mathcal{E}\,\sigma_{x|\theta}^2 . \tag{50}$$

We see that the parallel-forms reliability of $\hat\theta$ is

$$\rho_{\hat\theta\hat\theta'} = 1 - \frac{1}{\sigma_{\hat\theta}^2}\,\mathcal{E}\,\frac{1}{I(\theta)} + o(n^{-1}) . \tag{51}$$

Priority in obtaining this result belongs to Sympson [Note 1]. Replacing population values on the right by the corresponding sample statistics, we have a sample estimator of the parallel-forms reliability coefficient of $\hat\theta$:

$$\hat\rho_{\hat\theta\hat\theta'} = 1 - \frac{N-1}{N^2 s_{\hat\theta}^2}\sum_{a=1}^{N}\frac{1}{I(\hat\theta_a)} . \tag{52}$$
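A sample reliability estimate of this kind needs only the ability estimates and the test information evaluated at each of them. The sketch below assumes the estimator has the form $1 - \frac{N-1}{N^2 s_{\hat\theta}^2}\sum_a 1/I(\hat\theta_a)$, that is, one minus the mean of $1/I(\hat\theta_a)$ divided by the unbiased variance estimate $Ns_{\hat\theta}^2/(N-1)$. The function name and the data are hypothetical.

```python
def reliability_estimate(theta_hats, infos):
    """Sample estimate of the parallel-forms reliability of theta-hat.

    theta_hats: maximum likelihood ability estimates, one per examinee
    infos:      test information evaluated at each theta-hat
    """
    N = len(theta_hats)
    m = sum(theta_hats) / N
    # Sample variance of theta-hat with divisor N.
    s2 = sum(t * t for t in theta_hats) / N - m * m
    sum_inv_info = sum(1.0 / i for i in infos)
    return 1.0 - (N - 1) * sum_inv_info / (N ** 2 * s2)

# Hypothetical data for three examinees.
rho_hat = reliability_estimate([0.0, 1.0, 2.0], [4.0, 4.0, 4.0])
```

Here the unbiased variance estimate is $1.0$ and the mean error variance is $.25$, so the estimated reliability is $.75$.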

Since $\hat\theta$ is not unbiased, and its error of estimation is not uncorrelated with $\theta$, we should not expect the usual reliability formulas of classical test theory to apply. A similar but not identical case is discussed in Lord and Novick (1968, Section 9.8). Thus $\rho_{\hat\theta\hat\theta'}$, $\rho_{\hat\theta\theta}^2$, and $\sigma_\theta^2/\sigma_{\hat\theta}^2$ are not interchangeable definitions of reliability. Since correlational measures are hard to interpret in the absence of linearity and homoscedasticity, we will not now push this investigation of reliability further.



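2.3 Corresponding Results for True Score $\zeta$

By the same reasoning used to obtain (44) we have

$$\sigma_\zeta^2 = \sigma_{\hat\zeta}^2 - 2\sigma_{\hat\zeta,\,\hat B(\hat\zeta)} - \mathcal{E}\,\frac{\hat\zeta'^2}{I(\hat\theta)} + o(n^{-1}) . \tag{53}$$

A useful estimator of $\sigma_\zeta^2$ can be calculated from

$$\hat\sigma_\zeta^2 = \frac{N}{N-1}\,s_{\hat\zeta}^2 - \frac{2N}{N-1}\,s_{\hat\zeta,\,\hat B(\hat\zeta)} - \frac{1}{N}\sum_{a=1}^{N}\frac{\hat\zeta_a'^2}{I(\hat\theta_a)} , \tag{54}$$

where $\hat\zeta_a'$ is $\zeta' \equiv \sum_i P_i'/n$ evaluated at $\hat\theta_a$. To estimate $s_\zeta^2$, we can use

$$\hat s_\zeta^2 = s_{\hat\zeta}^2 - 2s_{\hat\zeta,\,\hat B(\hat\zeta)} - \frac{N-1}{N^2}\sum_{a=1}^{N}\frac{\hat\zeta_a'^2}{I(\hat\theta_a)} . \tag{55}$$

As in (50) - (52) we have

$$\sigma_{\hat\zeta\hat\zeta'} = \sigma_{\hat\zeta}^2 - \mathcal{E}\,\sigma_{\hat\zeta|\theta}^2 , \tag{56}$$

$$\rho_{\hat\zeta\hat\zeta'} = 1 - \frac{1}{\sigma_{\hat\zeta}^2}\,\mathcal{E}\,\frac{\zeta'^2}{I(\theta)} + o(n^{-1}) , \tag{57}$$

$$\hat\rho_{\hat\zeta\hat\zeta'} = 1 - \frac{N-1}{N^2 s_{\hat\zeta}^2}\sum_{a=1}^{N}\frac{\hat\zeta_a'^2}{I(\hat\theta_a)} . \tag{58}$$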

2.4 Numerical Results for True Scores

At moderate ability levels, (28) provides adequate but usually negligible corrections for bias in $\hat\theta$. Experience shows that at very low ability levels, the usual test length ($n$) of 50 or 100 items is not long enough for the asymptotic results of (28) to apply. For example, an examinee whose true $\theta$ is $-3$ may easily obtain an estimated ability $\hat\theta$ of $-30$ or of $-\infty$. For sufficiently long tests, such extreme values of $\hat\theta$ would have negligible probability, but with the usual values of $n$, equation (28) is totally inadequate for correcting $\hat\theta$ for bias at low ability levels.

This same difficulty carries over to the unbiased estimation of $s_\theta^2$ using (46). Since all ability levels are involved in (46), the formula is useless in practice for any group that contains even a few low-ability examinees. Fortunately, this difficulty does not carry over to the estimation of ability on the true-score ($\zeta$) scale.

The hypothetical SAT Verbal Test of Tables 1-3 was administered to a typical group of 2995 hypothetical examinees. The bias in $\hat\zeta$ was estimated for each examinee and a corrected $\hat\zeta$ obtained from (38):

$$\text{corrected } \hat\zeta \equiv \hat\zeta - B_1(\hat\zeta) .$$

In a few cases where the corrected $\hat\zeta$ would have been below the chance level $\sum_i c_i/n$, the corrected $\hat\zeta$ was set equal to $\sum_i c_i/n$.
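The correction rule just described, including the floor at the chance level, amounts to a one-line computation. The sketch below is a minimal illustration; the function name and the numerical inputs are hypothetical, and the estimated bias is assumed to have been computed beforehand.

```python
def corrected_zeta(zeta_hat, bias, guessing_params):
    """Subtract the estimated bias from zeta-hat, never going below chance level.

    zeta_hat:        uncorrected estimate of the proportion-correct true score
    bias:            estimated first-order bias of zeta-hat for this examinee
    guessing_params: the c_i of all n items; their mean is the chance level
    """
    chance_level = sum(guessing_params) / len(guessing_params)
    return max(zeta_hat - bias, chance_level)

# Hypothetical two-item illustration with c_i = 0.2 for both items.
typical = corrected_zeta(0.5, 0.001, [0.2, 0.2])      # ordinary case
floored = corrected_zeta(0.2005, 0.002, [0.2, 0.2])   # falls below chance, so floored
```

The floor matters only for examinees whose estimate is already very close to the chance level.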

The mean of the 2995 true $\zeta$ used to generate the data was .5280, the mean of the uncorrected $\hat\zeta$ was .5294, the mean of the corrected $\hat\zeta$ was .5288. Thus the correction was in the right direction, but not large enough. The uncorrected mean $\hat\zeta$ was already so accurate as to leave little room for improvement.

Next, (55) was used to estimate $s_\zeta$. The true value was $s_\zeta = .1610$, the standard deviation of $\hat\zeta$ was $s_{\hat\zeta} = .1660$, the corrected estimate from (55) was $\hat s_\zeta = .1614$. The correction worked very well here.

The parallel-forms reliability of $\hat\zeta$ was estimated from (58) to be $\hat\rho_{\hat\zeta\hat\zeta'} = .9420$. We have no 'true' value against which this can be compared, but the estimate seems a reasonable one. The Kuder-Richardson formula-20 reliability of number-right scores for these data is .9275.
It should be remembered that both the formulas and the numerical results in this report apply in situations where the item parameters are known. These formulas may be satisfactory for situations where the item parameters have been estimated from large groups not containing the examinees whose ability estimates are to be corrected for bias. These formulas will not be adequate for situations where the item parameters and ability parameters are estimated simultaneously from a single data set.